Scientific IPython Setup

We need Python's plotting library, matplotlib. Your environment might load matplotlib automatically, but for this tutorial I'll load it explicitly using this convention. If you are unfamiliar with matplotlib, do the same as I do here, and everything that follows will work without modification.


In [2]:
%matplotlib inline
import matplotlib.pyplot as plt

We also might want to use scientific Python libraries. Finally, we'll import mr itself.


In [1]:
import numpy as np
import pandas as pd
from pandas import DataFrame, Series  # for convenience

import mr

Optionally, tweak the plotting styles.


In [29]:
plt.rcParams['figure.figsize'] = 10, 6
plt.rcParams['image.cmap'] = 'gray'

Locate Features

Open a video

Load a video file as a frames. This object is an iterator that returns a frame each time as its called. The frame is a numpy.ndarray of brightness values.


In [3]:
frames = mr.Video('../mr/tests/water/bulk-water.mov')

In [4]:
frames


Out[4]:
<Frames>
Source File: ../mr/tests/water/bulk-water.mov
Frame Dimensions: 640 x 424
Cursor at Frame 0 of 480

The Video constructor assumes brightfield video, inverting black and white. (See image below.) To process video of fluorescent particles, include the keyword argument invert=False.


In [5]:
mr.play(frames) # spawn a window playing the video

Find features in one frame.

Take a look at, say, Frame 5.


In [30]:
plt.imshow(frames[5], cmap=plt.cm.gray)


Out[30]:
<matplotlib.image.AxesImage at 0x1203f1bd0>

Locate bright features with a diameter of about 11 px. The size must be an odd integer, and it is better to err on the large side. We will see why below.


In [32]:
f = mr.locate(frames[5], 11)

mr.locate returns a spreadsheet-like object called a DataFrame. It lists

  • each feature's position,
  • various characterizations of its appearance, which we will use to filter out spurrious features,
  • the "signal" strength and an estimate of uncertainty, both derived from this paper

(DataFrames are provided by the pandas module, and you can read more about them in the pandas documentation. They can easily be exported to a variety of formats.)


In [8]:
f.head() # shows the first few rows of data


Out[8]:
x y mass size ecc signal ep
0 69.227507 5.805180 3755 2.846401 0.066815 21.609637 0.095450
1 205.860481 6.449213 4414 2.908863 0.149883 24.609637 0.080542
2 336.196881 6.378277 2264 2.133259 0.082158 24.609637 0.146473
3 36.281895 8.087197 4262 2.920783 0.099597 22.609637 0.087016
4 277.321270 7.803879 2729 2.808470 0.160056 14.609637 0.144733

5 rows × 7 columns


In [33]:
mr.annotate(f, frames[5])


Out[33]:
<matplotlib.axes.AxesSubplot at 0x120b66610>

Many of these are spurious; they are fleeting peaks in brightness that aren't actually particles. There are many ways to distinguish real particles from spurrious ones. The most important way is to look at total brightness ("mass").


In [34]:
ax = f['mass'].hist()
ax.set_xlabel('mass')
ax.set_ylabel('count')


Out[34]:
<matplotlib.text.Text at 0x120b4bfd0>

In [11]:
f = mr.locate(frames[5], 11, minmass=2000)

In [35]:
mr.annotate(f, frames[5])


Out[35]:
<matplotlib.axes.AxesSubplot at 0x120b455d0>

There are more options in mr.locate, which you can review in the docstring using help(mr.locate) or mr.locate?.


In [13]:
help(mr.locate)


Help on function locate in module mr.feature:

locate(image, diameter, minmass=100.0, maxsize=None, separation=None, noise_size=1, smoothing_size=None, threshold=1, invert=False, percentile=64, topn=None, preprocess=True)
    Locate Gaussian-like blobs of a given approximate size.
    
    Preprocess the image by performing a band pass and a threshold.
    Locate all peaks of brightness, characterize the neighborhoods of the peaks
    and take only those with given total brightnesss ("mass"). Finally,
    refine the positions of each peak.
    
    Parameters
    ----------
    image : image array (any dimensions)
    diameter : feature size in px
    minmass : minimum integrated brightness
        Default is 100, but a good value is often much higher. This is a
        crucial parameter for elminating spurrious features.
    maxsize : maximum radius-of-gyration of brightness, default None
    separation : feature separation, in pixels
        Default is the feature diameter + 1.
    noise_size : width of Gaussian blurring kernel, in pixels
        Default is 1.
    smoothing_size : size of boxcar smoothing, in pixels
        Default is the same is feature separation.
    threshold : Clip bandpass result below this value.
        Default 1; use 8 for 16-bit images.
    invert : Set to True if features are darker than background. False by
        default.
    percentile : Features must have a peak brighter than pixels in this
        percentile. This helps eliminate spurrious peaks.
    topn : Return only the N brightest features above minmass. 
        If None (default), return all features above minmass.
    preprocess : Set to False to turn out automatic preprocessing.
    
    Returns
    -------
    DataFrame([x, y, mass, size, ecc, signal])
        where mass means total integrated brightness of the blob,
        size means the radius of gyration of its Gaussian-like profile,
        and ecc is its eccentricity (1 is circular).
    
    See Also
    --------
    batch : performs location on many images in batch
    
    Notes
    -----
    This is an implementation of the Crocker-Grier centroid-finding algorithm.
    [1]_
    
    References
    ----------
    .. [1] Crocker, J.C., Grier, D.G. http://dx.doi.org/10.1006/jcis.1996.0217

All functions in mr are documented in this way.

Check for subpixel accuracy.

As Eric Weeks points out in the tutorial for his tutorial, a quick way to check for subpixel accuracy is to check that the decimal part of the x and/or y positions are evenly distributed. mr provides a convenience function for this:


In [36]:
mr.subpx_bias(f)


Out[36]:
<matplotlib.axes.AxesSubplot at 0x120b3a650>

If we use a mask size that is too small, the histogram often shows a dip in the middle.


In [37]:
mr.subpx_bias(mr.locate(frames[5], 7))


Out[37]:
<matplotlib.axes.AxesSubplot at 0x11f9d7350>

Find features in all frames.

You can select:

  • all frames, frames[:]
  • a range of frames, like frames[100:200]
  • all frames after a starting frames, like frames[100:]
  • a list of specific frames, like frames[[100, 107, 113]]

We'll locate features in the first 300 frames from this video. We use mr.batch, which calls mr.locate on each frame and collects the results.


In [16]:
f = mr.batch(frames[:300], 11, minmass=2000)


Frame 300: 455 features

As each frame is analyzed, mr.batch reports the frame number and how many features were found. If this number runs unexpectedly low or high, you may wish to interrupt it and try different parameters.

If you have OpenCV installed (optional) you can view the circled features as a video.


In [17]:
mr.circle(f, frames[:300])  # opens a separate window


Press Ctrl+C to interrupt video.

We have the locations of the particles in each frame. Next we'll associate each particle with an ID number and track it from frame to frame.

First, we must must specify a maximum displacement, the farthest a particle is likely to travel between frames. We must not underestimate how far a particle might travel, but we should choose the smallest reasonable value because a large value slows computation time considerably. In this case, 5 pixels is reasonable.

Second, we allow for the possibility that a particle might be missed in one frame and then seen again in the next frame. (Perhaps its "mass" slipped below our cutoff due to noise in the video.) Memory keeps track of disappeared particles and maintains their ID for up to some number of frames after their last appearance. We'll choose 3.


In [18]:
t = mr.link(f, 5, memory=3)


Frame 300: 455 trajectories present

The result contains the information from the features f with an additional column, probe, identifying each feature with a label.


In [20]:
t.head()


Out[20]:
x y mass size ecc signal ep frame probe
0 233.285353 13.437998 3558 2.414065 0.120121 19.458034 0.148157 0 0
1 233.311850 13.818745 3638 2.559517 0.048906 19.536512 0.130360 1 0
2 233.787198 14.607853 3261 2.413864 0.097972 19.515001 0.151275 2 0
3 234.489771 14.494402 3466 2.456723 0.086164 19.572660 0.140946 3 0
4 234.400785 14.219731 3568 2.563488 0.098350 19.575473 0.131169 4 0

5 rows × 9 columns

Filter spurrious trajectories.

We have more filtering to do. Empheremeral trajectories -- seen only for a few frames -- are usually spurrious and never useful. The convenience function filter_stubs keeps only trajectories that last for a given number of frames.


In [23]:
t1 = mr.filter_stubs(t, 50)
# Compare the number of probes in the unfiltered and filtered data.
t['probe'].nunique(), t1['probe'].nunique()


Out[23]:
(3879, 715)

We can also filter trajectories by their appearance. At this stage, with trajectories linked, we can look at a feature's "average appearance" throughout its trajectory, giving a more accurate picture.


In [38]:
mr.mass_size(t1.groupby('probe').mean())  # convenience function -- just plots size vs. mass


Out[38]:
<matplotlib.axes.AxesSubplot at 0x121ea5c50>

The probes with especially low mass or especially large size are probably out of focus or aggregated, respectively. It is best to experiment by trial and error, filtering out regions of mass-size space and looking at the results using mr.annotate and mr.circle. In the end, we need to separate the good probes from the spurrious ones, and it doesn't matter how we get it done.


In [47]:
condition = lambda x: (x['mass'].mean() > 2800) & (x['size'].mean() < 3.0) & (x['ecc'].mean() < 0.1)
t2 = mr.filter(t1, condition)  # a wrapper for pandas' filter that works around a bug in v 0.12

In [48]:
mr.annotate(t2[t2['frame'] == 0], frames[0])


Out[48]:
<matplotlib.axes.AxesSubplot at 0x12057fe90>

Trace the trajectories.


In [49]:
mr.plot_traj(t1)


Out[49]:
<matplotlib.axes.AxesSubplot at 0x1243f8dd0>

Remove overall drift

Compute the overall drifting motion, which we will subtract away, adopting the reference frame of the probes' average position.


In [50]:
d = mr.compute_drift(t1)

In [51]:
d.plot()


Out[51]:
<matplotlib.axes.AxesSubplot at 0x122847e90>

In [52]:
tm = mr.subtract_drift(t1, d)

With the overall drifting motion subtracted out, we plot the trajectories again. We observe nice random walks.


In [53]:
mr.plot_traj(tm)


Out[53]:
<matplotlib.axes.AxesSubplot at 0x11f019450>

Analyze trajectories

Mean Squared Displacement of Individal Probes

Compute the mean squared displacement of each particle and plot MSD vs. lag time.


In [54]:
im = mr.imsd(tm, 100/285., 24) # microns per pixel = 100/285., frames per second = 24

In [62]:
im.plot(loglog=True, style='k-', alpha=0.1, legend=False) # black lines, semitransparent, no legend
plt.gca().set_ylabel(r'$\langle \Delta r^2 \rangle$ [$\mu$m$^2$]');


Since we only analyzed 300 frames, the statistics are poor at large lag times. With more frames, we can study larger lag times.

Ensemble Mean Squared Displacement


In [57]:
em = mr.emsd(tm, 100/285., 24)

In [58]:
em.plot(loglog=True, style='o')


Out[58]:
<matplotlib.axes.AxesSubplot at 0x122c12d50>

We can easily fit this to a power law using a convenience function, fit_powerlaw, that performs a linear regression in log space.


In [63]:
em.plot(loglog=True, style='ro')
mr.utils.fit_powerlaw(em)


hi
Out[63]:
n A
msd 1.046943 1.681362

1 rows × 2 columns

In water, a viscous material, the expected power-law exponent $n = 1$. The value of $A = 4D$, where $D$ is the particles' diffusivity. $D$ is related to viscosity $\eta$, particle radius $a$, and temperature $T$ as

$D = \displaystyle\frac{kT}{6\pi\eta a}$.

For paritcles with a 1 $\mu\text{m}$ diameter in room-temperature water, $A \approx 1.74$. Our value above is not far off.